Parallelizing Frequent Itemset Mining with FP-Trees

Authors

  • Peiyi Tang
  • Markus P. Turkia
Abstract

We propose a new scheme for parallelizing frequent itemset mining algorithms. By using extended conditional databases and k-prefix search space partitioning, the scheme creates more parallel tasks with better-balanced execution times. We present an implementation of the scheme with FP-trees and the results of an experimental evaluation showing increased speedup.
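As a rough illustration of k-prefix search space partitioning in general, the following Python sketch creates one independent task per frequent k-itemset and mines each task from its own conditional database. It is not the authors' implementation: it uses plain lists and an ordinary conditional database rather than FP-trees or the paper's extended conditional databases, and all function names (`frequent_items`, `k_prefix_tasks`, `conditional_database`, `mine_task`) are hypothetical.

```python
from itertools import combinations


def frequent_items(transactions, min_support):
    """Return the items whose support is at least min_support, in sorted order."""
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    return sorted(i for i, c in counts.items() if c >= min_support)


def k_prefix_tasks(transactions, min_support, k):
    """Create one task per frequent k-itemset (the 'k-prefix' of a subtree)."""
    items = frequent_items(transactions, min_support)
    tasks = []
    for prefix in combinations(items, k):
        support = sum(1 for t in transactions if set(prefix) <= set(t))
        if support >= min_support:
            tasks.append(prefix)
    return tasks


def conditional_database(transactions, prefix):
    """Keep transactions containing prefix, restricted to items after its last item."""
    last = prefix[-1]
    required = set(prefix)
    return [sorted(i for i in set(t) if i > last)
            for t in transactions if required <= set(t)]


def mine_task(transactions, prefix, min_support):
    """Mine every frequent itemset that starts with prefix (assumption: naive
    recursion over conditional databases stands in for FP-growth)."""
    results = {}

    def grow(current, db):
        results[current] = len(db)  # support = size of the conditional database
        for item in frequent_items(db, min_support):
            grow(current + (item,), conditional_database(db, (item,)))

    grow(tuple(prefix), conditional_database(transactions, prefix))
    return results


transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
tasks = k_prefix_tasks(transactions, min_support=2, k=2)
merged = {}
for prefix in tasks:  # each iteration is independent and could run in parallel
    merged.update(mine_task(transactions, prefix, min_support=2))
```

With k = 1 there is one task per frequent item; a larger k yields more, smaller tasks, which is the intuition behind finer-grained load balancing. Itemsets shorter than k are not covered by these tasks and would need to be counted separately.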

Similar resources

Concurrent Processing of Frequent Itemset Queries Using FP-Growth Algorithm

Discovery of frequent itemsets is a very important data mining problem with numerous applications. Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on frequent itemset mining has been done so far, focusing mainly on developing faster complete mining al...

FP-Bonsai: The Art of Growing and Pruning Small FP-Trees

In the context of mining frequent itemsets, numerous strategies have been proposed to push several types of constraints within the most well known algorithms. In this paper, we integrate the recently proposed ExAnte data reduction technique within the FP-growth algorithm. Together, they result in a very efficient frequent itemset mining algorithm that effectively exploits monotone constraints.

Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions

The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of "null" transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...

Three Strategies for Concurrent Processing of Frequent Itemset Queries Using FP-Growth

Frequent itemset mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. Recently, a new problem of optimizing processing of sets of frequent itemset queries has been considered and two multiple query optimization techniques for frequent itemset queries: Mine Merge and Common Counting have been proposed and ...

Parallelizing Frequent Itemset Mining Process using High Performance Computing

Data is growing at an enormous rate and mining this data is becoming a herculean task. Association Rule mining is one of the important algorithms used in data mining and mining frequent itemset is a crucial step in this process which consumes most of the processing time. Parallelizing the algorithm at various levels of computation will not only speed up the process but will also allow it to han...

Publication date: 2006